The original preregistrations for the studies contained both hypotheses and the specific analytic strategies that would be used to test them. However, these preregistrations did not include a meta-analytic strategy. Separately, a number of research questions/hypotheses were generated from exploration of the data from Experiments 1-6 that were not contained in the original preregistration, or where the specific analytic strategy to test them was poorly specified or more difficult to interpret. Separately, some methodological improvements were thought of after Experiments 1-6 was run (e.g., improved exclusion criteria to ensure participants stayed on the page where they watched/listened to the intervention in its entirety). We therefore elected to use the data from Experiments 1-6 to create this (non-preregistered) alternative analytic strategy that formalized our core research questions, hypotheses, analytic models, inference rules, and other researcher degrees of freedom. This analytic strategy (and code to implement it) will be preregistered for Experiment 7 which will provide strong confirmatory tests of these hypotheses.

Dependencies & functions

# dependencies
library(tidyverse)
library(knitr)
library(kableExtra)
library(brms)
library(parallel)
library(tidybayes)
library(bayestestR)
library(sjPlot)
library(psych)
library(rsample)
library(broom)
library(purrr)
library(IATscores)
library(lavaan)
library(semTools)
library(modelr)
library(furrr)

# set up parallel processing
future::plan(multiprocess)

# knitr options
options(knitr.kable.NA = "/")

# set seed for bootstrapping reproducibility
set.seed(42)

# create necessary folder
dir.create("models")

Exclusions & standaridization

All dependent variables (self-reported evaluations and IAT D2 scores) were standardized (by 1 SD) after exclusions and prior to analysis condition (see Lorah, 2018: https://doi.org/10.1186/s40536-018-0061-2). This was done within each level of both IV (i.e., by Source Valence condition [positive vs. negative], and by Video Content [Genuine vs. Deepfaked]). As such, the beta estimates obtained from the Bayesian models (see research questions and data analysis plans below) therefore represent standardized beta values (\(\beta\) rather than \(B\)). More importantly, the nature of this standardization makes these estimates somewhat comparable to the frequentist standardized effect size metric Cohen’s \(d\), as both are a differences in (estimated) means as a proportion of SD although they should not be treated as equivalent. Effect size magnitude here can therefore be thought of along comparable scales as Cohen’s \(d\). As such, to aid interpretability, the point estimates of effect size will be reported as \(\delta\) (delta).

# full data
data_processed <- read.csv("../data/processed/4_data_participant_level_with_hand_scoring.csv") %>%
  # include only experiments 1-6
  filter(experiment %in% c(1, 2, 3, 4, 5, 6)) %>%
  # set factor levels for t test comparisons
  mutate(source_valence = fct_relevel(source_valence,
                                      "negative",
                                      "positive"),
         experiment_condition = fct_relevel(experiment_condition,
                                            "genuine",
                                            "deepfaked"),
         experiment = as.factor(experiment))

# apply exclusions
data_after_exclusions <- data_processed %>%
  filter(exclude_subject == FALSE & 
           exclude_implausible_intervention_linger == FALSE) %>%
  # standardize DVs by 1SD within each experiment and their conditions
  group_by(experiment, experiment_condition, source_valence) %>%
  mutate(mean_self_reported_evaluation = mean_self_reported_evaluation/sd(mean_self_reported_evaluation),
         IAT_D2 = IAT_D2/sd(IAT_D2),
         mean_intentions = mean_intentions/sd(mean_intentions)) %>%
  ungroup()

# item level for iat
data_iat_item_level_after_exclusions <- read_csv("../data/processed/2.4_data_iat_item_level.csv") %>%
  # exclude the same participants as above
  semi_join(rename(data_after_exclusions, subject_original = subject), by = "subject_original") 

Distributions

ggplot(data_after_exclusions, aes(mean_self_reported_evaluation, color = experiment)) +
  geom_density() +
  facet_wrap( ~ experiment_condition + source_valence) +
  ggtitle("Standardized scores")

ggplot(data_after_exclusions, aes(IAT_D2, color = experiment)) +
  geom_density() +
  facet_wrap( ~ experiment_condition + source_valence) +
  ggtitle("Standardized scores")

ggplot(data_after_exclusions, aes(mean_intentions, color = experiment)) +
  geom_density() +
  facet_wrap( ~ experiment_condition + source_valence) +
  ggtitle("Standardized scores")

Demographics

Pre exclussion

data_processed %>%
  group_by(experiment) %>%
  summarise(n = n(),
            excluded_n = sum(exclude_subject > 0 | exclude_implausible_intervention_linger > 0),
            excluded_percent = (excluded_n / n) *100) %>%
  mutate_if(is.numeric, round, digits = 1) %>%
  kable(align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
experiment n excluded_n excluded_percent
1 165 24 14.5
2 167 36 21.6
3 428 91 21.3
4 429 106 24.7
5 276 66 23.9
6 265 61 23.0

Post exclusions

data_after_exclusions %>%
  group_by(experiment) %>%
  summarise(n = n(),
            age_mean = mean(age, na.rm = TRUE),
            age_sd = sd(age, na.rm = TRUE)) %>%
  mutate_if(is.numeric, round, digits = 1) %>%
  kable(align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
experiment n age_mean age_sd
1 141 29.7 7.6
2 131 31.1 7.3
3 337 29.8 8.7
4 323 30.1 9.0
5 210 31.2 11.5
6 204 33.3 12.5
data_after_exclusions %>%
  count(experiment, gender) %>%
  spread(gender, n) %>%
  kable(knitr.kable.NA = "/", align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
experiment female male Non-binary other Prefer not to disclose
1 67 73 / 1 /
2 76 55 / / /
3 184 149 / 4 /
4 189 132 / 2 /
5 119 88 2 / 1
6 120 82 2 / /

Internal consistency

Self-reported evaluations

model_sr <- "scale =~ ratings_bad_good + ratings_dislike_like + ratings_negative_positive" 

fit_cfa_sr <- data_after_exclusions %>%
  cfa(model = model_sr, data = .) 

results_reliability_sr <- fit_cfa_sr %>%
  reliability() %>%
  as.data.frame() %>%
  rownames_to_column(var = "metric") %>%
  select(metric, estimate = scale) %>%
  filter(metric %in% c("alpha",
                       "omega2")) %>%
  mutate(metric = recode(metric,
                         "alpha" = "alpha",
                         "omega2" = "omega_t"),
         estimate = round(estimate, 3))

results_reliability_sr %>%
  kable(knitr.kable.NA = "/", align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
metric estimate
alpha 0.987
omega_t 0.987

IAT

split half

results_iat_split_half_reliability <- data_iat_item_level_after_exclusions %>%
  SplitHalf.D2(IATdata = .) %>%
  mutate(algorithm = ifelse(algorithm == "p2112", "D2", algorithm),
         splithalf = round(splithalf, 3))
## [1] "2020-12-02 01:48:20: Applying parameter P4 = dist"
## [1] "2020-12-02 01:48:20: Applying parameters P1 and P2"
## [1] "2020-12-02 01:48:20: Applying parameter P3 = dscore"
## [1] "2020-12-02 01:48:20: Applying parameters P1 and P2"
## [1] "2020-12-02 01:48:21: Applying parameter P3 = dscore"
## [1] "2020-12-02 01:48:21: IAT scores have been computed"
## [1] "2020-12-02 01:48:21: Applying parameter P4 = dist"
## [1] "2020-12-02 01:48:21: Applying parameters P1 and P2"
## [1] "2020-12-02 01:48:21: Applying parameter P3 = dscore"
## [1] "2020-12-02 01:48:22: Applying parameters P1 and P2"
## [1] "2020-12-02 01:48:22: Applying parameter P3 = dscore"
## [1] "2020-12-02 01:48:22: IAT scores have been computed"
results_iat_split_half_reliability %>%
  kable(knitr.kable.NA = "/", align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
algorithm splithalf
D2 0.839

Behavioral intentions

model_bi <- "scale =~ behavioral_intentions_share + behavioral_intentions_subscribe + behavioral_intentions_recommend" 

fit_cfa_bi <- data_after_exclusions %>%
  cfa(model = model_bi, data = .) 

results_reliability_bi <- fit_cfa_bi %>%
  reliability() %>%
  as.data.frame() %>%
  rownames_to_column(var = "metric") %>%
  select(metric, estimate = scale) %>%
  filter(metric %in% c("alpha",
                       "omega2")) %>%
  mutate(metric = recode(metric,
                         "alpha" = "alpha",
                         "omega2" = "omega_t"),
         estimate = round(estimate, 3))

results_reliability_bi %>%
  kable(knitr.kable.NA = "/", align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
metric estimate
alpha 0.941
omega_t 0.941

RQ1 & 2: Can online audio/video content establish first impressions, even when it’s not real?; Are Deepfakes just as good as genuine online content at establishing first impressions?

  • Analyses employ Bayesian multilevel models with experiment employed as a random (Group level) intercept, and source_valence, experiment_condition and their interaction as IVs. This could therefore be described as akin to a Bayesian multilevel ANOVA.
  • DVs were standardize as noted above, and as such fitted model estimates represent standardized beta values (which due to the specifics of the standardization have comparable [but not exact] interpretation as Cohen’s d values).
  • Bayesian p values are also reported: these are on a similar scale to frequentist p values, but technically are 1 minus the posterior probability that the effect is greater than 0, i.e., \(1 - P(\beta>0)\).
  • Inspection of the posterior distributions allow us to infer that we employed weak priors placed on all parameters (normal distribution with M = 0 and SD = 10). Inspection of the chains indicated good convergence in all cases.

Sample sizes

data_after_exclusions %>%
  select(source_valence, 
         experiment_condition) %>%
  drop_na() %>%
  count(experiment_condition,
        source_valence) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
experiment_condition source_valence n
genuine negative 382
genuine positive 425
deepfaked negative 257
deepfaked positive 282

Self-reported evaluations

Fit model

fit_exploratory_selfreport <-
  brm(formula = mean_self_reported_evaluation ~ source_valence * experiment_condition + (1 | experiment),
      family = gaussian(),
      data    = data_after_exclusions,
      file    = "models/fit_exploratory_selfreport",
      prior   = prior(normal(0, 10)),
      iter    = 10000,
      warmup  = 3000,
      control = list(adapt_delta = 0.99),  # to avoid divergent transitions
      chains  = 4,
      cores   = parallel::detectCores())

Inspect convergence

summary(fit_exploratory_selfreport)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: mean_self_reported_evaluation ~ source_valence * experiment_condition + (1 | experiment) 
##    Data: data_after_exclusions (Number of observations: 1346) 
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
##          total post-warmup samples = 28000
## 
## Group-Level Effects: 
## ~experiment (Number of levels: 6) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.18      0.11     0.06     0.46 1.00     6283     9291
## 
## Population-Level Effects: 
##                                                      Estimate Est.Error
## Intercept                                               -1.48      0.10
## source_valencepositive                                   2.71      0.07
## experiment_conditiondeepfaked                            0.08      0.09
## source_valencepositive:experiment_conditiondeepfaked     0.01      0.11
##                                                      l-95% CI u-95% CI Rhat
## Intercept                                               -1.68    -1.28 1.00
## source_valencepositive                                   2.56     2.85 1.00
## experiment_conditiondeepfaked                           -0.09     0.25 1.00
## source_valencepositive:experiment_conditiondeepfaked    -0.22     0.23 1.00
##                                                      Bulk_ESS Tail_ESS
## Intercept                                                7616     9466
## source_valencepositive                                  15458    18050
## experiment_conditiondeepfaked                           14302    17335
## source_valencepositive:experiment_conditiondeepfaked    13679    17151
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     1.03      0.02     1.00     1.07 1.00    22400    18545
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit_exploratory_selfreport, ask = FALSE)

Check informativeness of prior

Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.

check_prior(fit_exploratory_selfreport) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter Prior_Quality
b_Intercept uninformative
b_source_valencepositive uninformative
b_experiment_conditiondeepfaked uninformative
b_source_valencepositive.experiment_conditiondeepfaked uninformative

Interpret posteriors

# plot_model(fit_exploratory_selfreport)
plot_model(fit_exploratory_selfreport, type = "pred", terms = c("source_valence", "experiment_condition"))

# percent moderation
draws_sr <-
  bind_cols(
    select(spread_draws(fit_exploratory_selfreport, b_source_valencepositive), b_source_valencepositive),
    select(spread_draws(fit_exploratory_selfreport, b_experiment_conditiondeepfaked), b_experiment_conditiondeepfaked),
    select(spread_draws(fit_exploratory_selfreport, `b_source_valencepositive:experiment_conditiondeepfaked`), `b_source_valencepositive:experiment_conditiondeepfaked`)
  ) %>%
  rename(main_valence = b_source_valencepositive,
         main_experiment_condition = b_experiment_conditiondeepfaked,
         interaction = `b_source_valencepositive:experiment_conditiondeepfaked`) %>%
  mutate(effect_genuine = main_valence,
         effect_deepfaked = main_valence + main_experiment_condition + interaction,
         #percent_moderation = (main_experiment_condition + interaction)/main_valence *100,  # alt method, same result
         percent_comparison = (effect_deepfaked/effect_genuine)*100)

# results
estimates_sr <-
  map_estimate(draws_sr) %>%
  full_join(bayestestR::hdi(draws_sr, ci = .95) %>%
              rename(CI_95_lower = CI_low,
                     CI_95_upper = CI_high) %>%
              as_tibble(),
            by = "Parameter") %>%
  full_join(bayestestR::hdi(draws_sr, ci = .90) %>%
              as_tibble() %>%
              rename(CI_90_lower = CI_low,
                     CI_90_upper = CI_high),
            by = "Parameter") %>%
  full_join(draws_sr %>%
              select(-percent_comparison) %>%
              gather(Parameter, value) %>%
              group_by(Parameter) %>%
              summarize(pd = mean(value > 0)) %>%
              mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
              ungroup() %>%
              select(Parameter, p),
            by = "Parameter") %>%
  select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper,
         CI_90_lower, CI_90_upper, p)

# results table
estimates_sr %>%
  mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter MAP_Estimate CI_95_lower CI_95_upper CI_90_lower CI_90_upper p
main_valence 2.71 2.56 2.85 2.58 2.82 0.0000000
main_experiment_condition 0.09 -0.09 0.25 -0.06 0.23 0.1794643
interaction 0.01 -0.21 0.24 -0.18 0.20 0.4717143
effect_genuine 2.71 2.56 2.85 2.58 2.82 0.0000000
effect_deepfaked 2.80 2.63 2.96 2.66 2.93 0.0000000
percent_comparison 102.84 97.19 109.21 98.39 108.42 /
# hypothesis testing
H1a <- ifelse((estimates_sr %>% filter(Parameter == "effect_genuine") %>% pull(CI_95_lower)) > 0, 
              "Accepted", "Rejected")

H1b <- ifelse((estimates_sr %>% filter(Parameter == "effect_deepfaked") %>% pull(CI_95_lower)) > 0, 
              "Accepted", "Rejected")

H2a <- ifelse((estimates_sr %>% filter(Parameter == "effect_deepfaked") %>% pull(CI_90_lower)) > 
                (estimates_sr %>% filter(Parameter == "effect_genuine") %>% pull(CI_95_lower)), 
              "Accepted", "Rejected")

comparison_string_sr <-
  paste0("Deepfakes are ",
         estimates_sr %>% filter(Parameter == "percent_comparison") %>% pull(MAP_Estimate) %>% round(1),
         "% (95% CI [",
         estimates_sr %>% filter(Parameter == "percent_comparison") %>% pull(CI_95_lower) %>% round(1),
         ", ",
         estimates_sr %>% filter(Parameter == "percent_comparison") %>% pull(CI_95_upper) %>% round(1),
         "]) as effective as genuine content in establishing self-reported evaluations")

H1a

The content of the genuine videos (i.e., Source Valence) will influence participants’ self-reported evaluations. Specifically, we will use a Bayesian linear model (model 1) to estimate a 95% Confidence Interval on standardized effect size change in self-reported evaluations between Source Valence conditions in the genuine video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.

  • Result: Accepted

H1b

The content of the Deepfaked videos (i.e., Source Valence) will influence participants’ self-reported evaluations. Specifically, we will use a Bayesian linear model (model 1) to estimate a 95% Confidence Interval on standardized effect size change in self-reported evaluations between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.

  • Result: Accepted

H2a

Change in self-reported evaluations (i.e., between Source Valence conditions) induced by Deepfaked video content will be non-inferior to genuine content.

  • Result: Accepted. Deepfakes are 102.8% (95% CI [97.2, 109.2]) as effective as genuine content in establishing self-reported evaluations.

Implicit

Fit model

fit_exploratory_implicit <-
  brm(formula = IAT_D2 ~ source_valence * experiment_condition + (1 | experiment),
      family = gaussian(),
      data    = data_after_exclusions,
      file    = "models/fit_exploratory_implicit",
      prior   = prior(normal(0, 10)),
      iter    = 10000,
      warmup  = 3000,
      control = list(adapt_delta = 0.99),  # to avoid divergent transitions
      chains  = 4,
      cores   = parallel::detectCores())

Inspect convergence

summary(fit_exploratory_implicit)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: IAT_D2 ~ source_valence * experiment_condition + (1 | experiment) 
##    Data: data_after_exclusions (Number of observations: 1346) 
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
##          total post-warmup samples = 28000
## 
## Group-Level Effects: 
## ~experiment (Number of levels: 6) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.20      0.12     0.07     0.50 1.00     5922     9310
## 
## Population-Level Effects: 
##                                                      Estimate Est.Error
## Intercept                                                0.05      0.11
## source_valencepositive                                   1.32      0.07
## experiment_conditiondeepfaked                            0.07      0.08
## source_valencepositive:experiment_conditiondeepfaked    -0.00      0.11
##                                                      l-95% CI u-95% CI Rhat
## Intercept                                               -0.17     0.26 1.00
## source_valencepositive                                   1.19     1.46 1.00
## experiment_conditiondeepfaked                           -0.10     0.23 1.00
## source_valencepositive:experiment_conditiondeepfaked    -0.22     0.22 1.00
##                                                      Bulk_ESS Tail_ESS
## Intercept                                                7697     8823
## source_valencepositive                                  16144    18114
## experiment_conditiondeepfaked                           15002    18404
## source_valencepositive:experiment_conditiondeepfaked    14394    17046
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     1.00      0.02     0.97     1.04 1.00    21725    18539
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit_exploratory_implicit, ask = FALSE)

Check informativeness of prior

Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.

check_prior(fit_exploratory_implicit) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter Prior_Quality
b_Intercept uninformative
b_source_valencepositive uninformative
b_experiment_conditiondeepfaked uninformative
b_source_valencepositive.experiment_conditiondeepfaked uninformative

Interpret posteriors

#plot_model(fit_exploratory_implicit)
plot_model(fit_exploratory_implicit, type = "pred", terms = c("source_valence", "experiment_condition"))

# percent moderation
draws_imp <-
  bind_cols(
    select(spread_draws(fit_exploratory_implicit, b_source_valencepositive), b_source_valencepositive),
    select(spread_draws(fit_exploratory_implicit, b_experiment_conditiondeepfaked), b_experiment_conditiondeepfaked),
    select(spread_draws(fit_exploratory_implicit, `b_source_valencepositive:experiment_conditiondeepfaked`), `b_source_valencepositive:experiment_conditiondeepfaked`)
  ) %>%
  rename(main_valence = b_source_valencepositive,
         main_experiment_condition = b_experiment_conditiondeepfaked,
         interaction = `b_source_valencepositive:experiment_conditiondeepfaked`) %>%
  mutate(effect_genuine = main_valence,
         effect_deepfaked = main_valence + main_experiment_condition + interaction,
         #percent_moderation = (main_experiment_condition + interaction)/main_valence *100,  # alt method, same result
         percent_comparison = (effect_deepfaked/effect_genuine)*100)

# results table
estimates_imp <-
  map_estimate(draws_imp) %>%
  full_join(bayestestR::hdi(draws_imp, ci = .95) %>%
              rename(CI_95_lower = CI_low,
                     CI_95_upper = CI_high) %>%
              as_tibble(),
            by = "Parameter") %>%
  full_join(bayestestR::hdi(draws_imp, ci = .90) %>%
              as_tibble() %>%
              rename(CI_90_lower = CI_low,
                     CI_90_upper = CI_high),
            by = "Parameter") %>%
  full_join(draws_imp %>%
              select(-percent_comparison) %>%
              gather(Parameter, value) %>%
              group_by(Parameter) %>%
              summarize(pd = mean(value > 0)) %>%
              mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
              ungroup() %>%
              select(Parameter, p),
            by = "Parameter") %>%
  select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper,
         CI_90_lower, CI_90_upper, p)

estimates_imp %>%
  mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter MAP_Estimate CI_95_lower CI_95_upper CI_90_lower CI_90_upper p
main_valence 1.33 1.18 1.46 1.21 1.44 0.0000000
main_experiment_condition 0.07 -0.10 0.23 -0.07 0.21 0.1996786
interaction 0.00 -0.23 0.21 -0.18 0.19 0.4972857
effect_genuine 1.33 1.18 1.46 1.21 1.44 0.0000000
effect_deepfaked 1.41 1.23 1.55 1.26 1.53 0.0000000
percent_comparison 104.46 93.73 117.95 95.34 115.66 /
# hypothesis testing
H1c <- ifelse((estimates_imp %>% filter(Parameter == "effect_genuine") %>% pull(CI_95_lower)) > 0, 
              "Accepted", "Rejected")

H1d <- ifelse((estimates_imp %>% filter(Parameter == "effect_deepfaked") %>% pull(CI_95_lower)) > 0, 
              "Accepted", "Rejected")

H2b <- ifelse((estimates_imp %>% filter(Parameter == "effect_deepfaked") %>% pull(CI_90_lower)) > 
                (estimates_imp %>% filter(Parameter == "effect_genuine") %>% pull(CI_95_lower)), 
              "Accepted", "Rejected")

comparison_string_imp <-
  paste0("Deepfakes are ",
         estimates_imp %>% filter(Parameter == "percent_comparison") %>% pull(MAP_Estimate) %>% round(1), 
         "% (95% CI [",
         estimates_imp %>% filter(Parameter == "percent_comparison") %>% pull(CI_95_lower) %>% round(1),
         ", ",
         estimates_imp %>% filter(Parameter == "percent_comparison") %>% pull(CI_95_upper) %>% round(1),
         "]) as effective as genuine content in establishing self-reported evaluations")

H1c

The content of the genuine videos (i.e., Source Valence) will influence participants’ IAT D2 scores. Specifically, we will use a Bayesian linear model (model 2) to estimate a 95% Confidence Interval on standardized effect size change in IAT D2 scores between Source Valence conditions in the genuine video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.

  • Result: Accepted

H1d

The content of the Deepfaked videos (i.e., Source Valence) will influence participants’ IAT D2 scores. Specifically, we will use a Bayesian linear model (model 2) to estimate a 95% Confidence Interval on standardized effect size change in IAT D2 scores between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.

  • Result: Accepted

H2b

Change in IAT D2 scores (i.e., between Source Valence conditions) induced by Deepfaked video content will be non-inferior to genuine content.

  • Result: Accepted. Deepfakes are 104.5% (95% CI [93.7, 118]) as effective as genuine content in establishing self-reported evaluations.

Behavioural intentions

Fit model

fit_exploratory_intentions <-
  brm(formula = mean_intentions ~ source_valence * experiment_condition, # no random effect for experiment as only exp 6 assessed intentions
      family = gaussian(),
      data    = data_after_exclusions,
      file    = "models/fit_exploratory_intentions",
      prior   = prior(normal(0, 10)),
      iter    = 10000,
      warmup  = 3000,
      control = list(adapt_delta = 0.99),  # to avoid divergent transitions
      chains  = 4,
      cores   = parallel::detectCores())

Inspect convergence

summary(fit_exploratory_intentions)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: mean_intentions ~ source_valence * experiment_condition 
##    Data: data_after_exclusions (Number of observations: 204) 
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
##          total post-warmup samples = 28000
## 
## Population-Level Effects: 
##                                                      Estimate Est.Error
## Intercept                                               -1.64      0.14
## source_valencepositive                                   1.12      0.20
## experiment_conditiondeepfaked                           -1.70      0.21
## source_valencepositive:experiment_conditiondeepfaked     1.95      0.28
##                                                      l-95% CI u-95% CI Rhat
## Intercept                                               -1.92    -1.36 1.00
## source_valencepositive                                   0.72     1.52 1.00
## experiment_conditiondeepfaked                           -2.10    -1.29 1.00
## source_valencepositive:experiment_conditiondeepfaked     1.38     2.50 1.00
##                                                      Bulk_ESS Tail_ESS
## Intercept                                               13786    17130
## source_valencepositive                                  12197    15327
## experiment_conditiondeepfaked                           11750    16015
## source_valencepositive:experiment_conditiondeepfaked    10334    14695
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     1.01      0.05     0.91     1.11 1.00    18031    17126
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit_exploratory_intentions, ask = FALSE)

Check informativeness of prior

Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.

check_prior(fit_exploratory_intentions) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter Prior_Quality
b_Intercept uninformative
b_source_valencepositive uninformative
b_experiment_conditiondeepfaked uninformative
b_source_valencepositive.experiment_conditiondeepfaked uninformative

Interpret posteriors

#plot_model(fit_exploratory_intentions)
plot_model(fit_exploratory_intentions, type = "pred", terms = c("source_valence", "experiment_condition"))

# percent moderation
draws_intentions <-
  bind_cols(
    select(spread_draws(fit_exploratory_intentions, b_source_valencepositive), b_source_valencepositive),
    select(spread_draws(fit_exploratory_intentions, b_experiment_conditiondeepfaked), b_experiment_conditiondeepfaked),
    select(spread_draws(fit_exploratory_intentions, `b_source_valencepositive:experiment_conditiondeepfaked`), `b_source_valencepositive:experiment_conditiondeepfaked`)
  ) %>%
  rename(main_valence = b_source_valencepositive,
         main_experiment_condition = b_experiment_conditiondeepfaked,
         interaction = `b_source_valencepositive:experiment_conditiondeepfaked`) %>%
  mutate(effect_genuine = main_valence,
         effect_deepfaked = main_valence + main_experiment_condition + interaction,
         #percent_moderation = (main_experiment_condition + interaction)/main_valence *100,  # alt method, same result
         percent_comparison = (effect_deepfaked/effect_genuine)*100)

# results table
estimates_intentions <-
  map_estimate(draws_intentions) %>%
  full_join(bayestestR::hdi(draws_intentions, ci = .95) %>%
              rename(CI_95_lower = CI_low,
                     CI_95_upper = CI_high) %>%
              as_tibble(),
            by = "Parameter") %>%
  full_join(bayestestR::hdi(draws_intentions, ci = .90) %>%
              as_tibble() %>%
              rename(CI_90_lower = CI_low,
                     CI_90_upper = CI_high),
            by = "Parameter") %>%
  full_join(draws_intentions %>%
              select(-percent_comparison) %>%
              gather(Parameter, value) %>%
              group_by(Parameter) %>%
              summarize(pd = mean(value > 0)) %>%
              mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
              ungroup() %>%
              select(Parameter, p),
            by = "Parameter") %>%
  select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper,
         CI_90_lower, CI_90_upper, p)

estimates_intentions %>%
  mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter MAP_Estimate CI_95_lower CI_95_upper CI_90_lower CI_90_upper p
main_valence 1.11 0.73 1.53 0.79 1.46 0
main_experiment_condition -1.70 -2.09 -1.28 -2.04 -1.36 0
interaction 1.95 1.39 2.50 1.48 2.41 0
effect_genuine 1.11 0.73 1.53 0.79 1.46 0
effect_deepfaked 1.37 0.99 1.76 1.05 1.69 0
percent_comparison 118.40 85.90 168.87 91.34 158.40 /
# hypothesis testing
H1e <- ifelse((estimates_intentions %>% filter(Parameter == "effect_genuine") %>% pull(CI_95_lower)) > 0, 
              "Accepted", "Rejected")

H1f <- ifelse((estimates_intentions %>% filter(Parameter == "effect_deepfaked") %>% pull(CI_95_lower)) > 0, 
              "Accepted", "Rejected")

H2c <- ifelse((estimates_intentions %>% filter(Parameter == "effect_deepfaked") %>% pull(CI_90_lower)) > 
                (estimates_intentions %>% filter(Parameter == "effect_genuine") %>% pull(CI_95_lower)), 
              "Accepted", "Rejected")

comparison_string_intentions <-
  paste0("Deepfakes are ",
         estimates_intentions %>% filter(Parameter == "percent_comparison") %>% pull(MAP_Estimate) %>% round(1), 
         "% (95% CI [",
         estimates_intentions %>% filter(Parameter == "percent_comparison") %>% pull(CI_95_lower) %>% round(1),
         ", ",
         estimates_intentions %>% filter(Parameter == "percent_comparison") %>% pull(CI_95_upper) %>% round(1),
         "]) as effective as genuine content in establishing self-reported evaluations")

H1e

The content of the genuine videos (i.e., Source Valence) will influence participants’ behavioral intention responses. Specifically, we will use a Bayesian linear model (model 2) to estimate a 95% Confidence Interval on standardized effect size change in behavioral intention scores between Source Valence conditions in the genuine video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.

  • Result: Accepted

H1f

The content of the Deepfaked videos (i.e., Source Valence) will influence participants’ behavioral intention responses. Specifically, we will use a Bayesian linear model (model 2) to estimate a 95% Confidence Interval on standardized effect size change in behavioral intention scores between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.

  • Result: Accepted

H2c

Change in behavioral intentions (i.e., between Source Valence conditions) induced by Deepfaked video content will be non-inferior to genuine content.

  • Result: Accepted. Deepfakes are 118.4% (95% CI [85.9, 168.9]) as effective as genuine content in establishing self-reported evaluations.

RQ3: How good are people at detecting deepfakes?

Inter-rater relibility

data_after_exclusions %>%
  count(deepfake_detection_open_recoded,
        deepfake_detection_rater_1,
        deepfake_detection_rater_2) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
deepfake_detection_open_recoded deepfake_detection_rater_1 deepfake_detection_rater_2 n
FALSE FALSE FALSE 414
FALSE FALSE TRUE 16
FALSE TRUE FALSE 31
TRUE TRUE TRUE 115
/ / / 770
data_after_exclusions %>%
  summarize(percent_agreement = round(mean(deepfake_detection_rater_1 == deepfake_detection_rater_2, na.rm = TRUE)*100, 1)) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
percent_agreement
91.8
data_after_exclusions %>%
  select(deepfake_detection_rater_1,                                   
         deepfake_detection_rater_2) %>%
  as.data.frame() %>%  # kappa function won't take tibbles
  psych::cohen.kappa(.)
## Call: cohen.kappa1(x = x, w = w, n.obs = n.obs, alpha = alpha, levels = levels)
## 
## Cohen Kappa and Weighted Kappa correlation coefficients and confidence boundaries 
##                  lower estimate upper
## unweighted kappa  0.72     0.78  0.84
## weighted kappa    0.72     0.78  0.84
## 
##  Number of subjects = 576

Interpretation of Kappa (Altman 1999, Landis JR, 1977):

  • 0.61 - 0.80 Substantial
  • 0.81 - 1.00 Almost perfect

Can people accurately detect deepfakes?

  • Youden’s J = sensitivity + specificity - 1, aka informedness, aka “the probability of an informed decision (as opposed to a random guess) and takes into account all predictions”
  • 95% CIs were bootstrapped via case removal and the percentile method.

Sample size

data_after_exclusions %>%
  count(experiment_condition,
        deepfake_detection_open_recoded) %>%
  drop_na() %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
experiment_condition deepfake_detection_open_recoded n
genuine FALSE 188
genuine TRUE 16
deepfaked FALSE 273
deepfaked TRUE 99

Classification stats

data_counts <- data_after_exclusions %>%
  select(experiment_condition, deepfake_detection_open_recoded) %>%
  drop_na() %>%
  count(experiment_condition, deepfake_detection_open_recoded)

TP <- pull(filter(data_counts, experiment_condition == "deepfaked" & deepfake_detection_open_recoded == TRUE),  n)
FP <- pull(filter(data_counts, experiment_condition == "genuine"   & deepfake_detection_open_recoded == TRUE),  n)
FN <- pull(filter(data_counts, experiment_condition == "deepfaked" & deepfake_detection_open_recoded == FALSE), n)
TN <- pull(filter(data_counts, experiment_condition == "genuine"   & deepfake_detection_open_recoded == FALSE), n)

#accuracy <- (TP+TN)/(TP+TN+FP+FN)
sensitivity <- TP / (TP+FN)
false_negative_rate <- 1 - sensitivity
specificity <- TN / (TN+FP)
false_positive_rate <- 1 - specificity

# Youden's J statistic aka informedness aka "the probability of an informed decision (as opposed to a random guess) and takes into account all predictions". a zero value when a diagnostic test gives the same proportion of positive results for groups with and without the disease, i.e the test is useless.
informedness <- sensitivity + specificity - 1

balanced_accuracy <- (sensitivity + specificity)/2

fit_exploratory_classification <-
  tibble(variable = c(
    #"accuracy",
    "balanced_accuracy",
    "informedness",
    #"sensitivity",
    "false_negative_rate",
    #"specificity",
    "false_positive_rate"
  ),
  observed = c(
    #accuracy,
    balanced_accuracy,
    informedness,
    #sensitivity,
    false_negative_rate,
    #specificity,
    false_positive_rate
  ))

Bootstrapped classification stats

if(file.exists("models/fit_exploratory_classification_bootstraps.rds")){

  fit_exploratory_classification_bootstraps <- read_rds("models/fit_exploratory_classification_bootstraps.rds")

} else {

  # create bootstraps using out of bag method. makes a df with values that are collapsed dfs.
  boots <- data_after_exclusions %>%
    select(experiment_condition, deepfake_detection_open_recoded) %>%
    drop_na() %>%
    bootstraps(times = 2000)

  # generalize to a summarize function ------
  bootstrap_categorization_stats <- function(split) {

    data_counts <- analysis(split) %>%
      count(experiment_condition, deepfake_detection_open_recoded)

    TP <- pull(filter(data_counts, experiment_condition == "deepfaked" & deepfake_detection_open_recoded == TRUE),  n)
    FP <- pull(filter(data_counts, experiment_condition == "genuine"   & deepfake_detection_open_recoded == TRUE),  n)
    FN <- pull(filter(data_counts, experiment_condition == "deepfaked" & deepfake_detection_open_recoded == FALSE), n)
    TN <- pull(filter(data_counts, experiment_condition == "genuine"   & deepfake_detection_open_recoded == FALSE), n)

    #accuracy <- (TP+TN)/(TP+TN+FP+FN)
    sensitivity <- TP / (TP+FN)
    false_negative_rate <- 1 - sensitivity
    specificity <- TN / (TN+FP)
    false_positive_rate <- 1 - specificity

    # Youden's J statistic aka informedness aka "the probability of an informed decision (as opposed to a random guess) and takes into account all predictions". a zero value when a diagnostic test gives the same proportion of positive results for groups with and without the disease, i.e the test is useless.
    informedness <- sensitivity + specificity - 1

    balanced_accuracy <- (sensitivity + specificity)/2

    results <-
      tibble(variable = c(
        #"accuracy",
        "balanced_accuracy",
        "informedness",
        #"sensitivity",
        "false_negative_rate",
        #"specificity",
        "false_positive_rate"
      ),
      value = c(
        #accuracy,
        balanced_accuracy,
        informedness,
        #sensitivity,
        false_negative_rate,
        #specificity,
        false_positive_rate
      ))

    return(results)
  }

  # apply to each bootstrap
  fit_exploratory_classification_bootstraps <- boots %>%
    mutate(categorization_stats = future_map(splits, bootstrap_categorization_stats)) %>%
    select(-splits) %>%
    unnest(categorization_stats)

  write_rds(fit_exploratory_classification_bootstraps, "models/fit_exploratory_classification_bootstraps.rds")

}

Results

classifications <- fit_exploratory_classification_bootstraps %>%
  group_by(variable) %>%
  summarize(ci_lower = quantile(value, 0.025),
            ci_upper = quantile(value, 0.975),
            .groups  = "drop") %>%
  full_join(fit_exploratory_classification, by = "variable") %>%
  mutate_if(is.numeric, round, digits = 2) %>%
  select(variable, observed, ci_lower, ci_upper) 

classifications %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE)
variable observed ci_lower ci_upper
balanced_accuracy 0.59 0.56 0.62
false_negative_rate 0.73 0.69 0.78
false_positive_rate 0.08 0.04 0.12
informedness 0.19 0.13 0.25

H3: Participants are poor at making accurate and informed judgements about whether online video content is genuine or Deepfaked. Our predictions here are descriptive/continuous rather than involving cut-off based inference rules.

H3a

We expect participants to be poor at correctly detecting Deepfakes (i.e., demonstrate a high false negative rate, FNR ≳ .80).

H3b

We expect participants to incorrectly detect Deepfakes even when the video content was real (i.e., demonstrate a high false positive rate, FPR ≳ .05).

H3c

We expect participants to be poor at making accurate decisions about whether content is genuine or not (i.e., balanced accuracy not greatly above chance, ≲ .60).

H3d

We expect participants to make poorly informed decisions about whether content is genuine or not (i.e., informedness/Youden’s J ≲ .25).

Even the subset of participants who were aware of the concept of deepfakes before the study?

Sample size

data_after_exclusions %>%
  filter(deepfake_awareness_open_recoded == TRUE) %>%
  count(experiment_condition,
        deepfake_detection_open_recoded) %>%
  drop_na() %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
experiment_condition deepfake_detection_open_recoded n
genuine FALSE 96
genuine TRUE 10
deepfaked FALSE 82
deepfaked TRUE 32

Classification stats

data_counts_subset <- data_after_exclusions %>%
  select(experiment_condition, deepfake_detection_open_recoded) %>%
  drop_na() %>%
  count(experiment_condition, deepfake_detection_open_recoded)

TP <- pull(filter(data_counts_subset, experiment_condition == "deepfaked" & deepfake_detection_open_recoded == TRUE),  n)
FP <- pull(filter(data_counts_subset, experiment_condition == "genuine"   & deepfake_detection_open_recoded == TRUE),  n)
FN <- pull(filter(data_counts_subset, experiment_condition == "deepfaked" & deepfake_detection_open_recoded == FALSE), n)
TN <- pull(filter(data_counts_subset, experiment_condition == "genuine"   & deepfake_detection_open_recoded == FALSE), n)

#accuracy <- (TP+TN)/(TP+TN+FP+FN)
sensitivity <- TP / (TP+FN)
false_negative_rate <- 1 - sensitivity
specificity <- TN / (TN+FP)
false_positive_rate <- 1 - specificity

# Youden's J statistic aka informedness aka "the probability of an informed decision (as opposed to a random guess) and takes into account all predictions". a zero value when a diagnostic test gives the same proportion of positive results for groups with and without the disease, i.e the test is useless.
informedness <- sensitivity + specificity - 1

balanced_accuracy <- (sensitivity + specificity)/2

fit_exploratory_classification_subset <-
  tibble(variable = c(
    #"accuracy",
    "balanced_accuracy",
    "informedness",
    #"sensitivity",
    "false_negative_rate",
    #"specificity",
    "false_positive_rate"
  ),
  observed = c(
    #accuracy,
    balanced_accuracy,
    informedness,
    #sensitivity,
    false_negative_rate,
    #specificity,
    false_positive_rate
  ))

Bootstrapped classification stats

if(file.exists("models/fit_exploratory_classification_bootstraps_subset.rds")){

  fit_exploratory_classification_bootstraps_subset <- read_rds("models/fit_exploratory_classification_bootstraps_subset.rds")

} else {

  # create bootstraps using out of bag method. makes a df with values that are collapsed dfs.
  boots <- data_after_exclusions %>%
    filter(deepfake_awareness_open_recoded == TRUE) %>%
    select(experiment_condition, deepfake_detection_open_recoded) %>%
    drop_na() %>%
    bootstraps(times = 2000)

  # generalize to a summarize function ------
  bootstrap_categorization_stats <- function(split) {

    data_counts <- analysis(split) %>%
      count(experiment_condition, deepfake_detection_open_recoded)

    TP <- pull(filter(data_counts, experiment_condition == "deepfaked" & deepfake_detection_open_recoded == TRUE),  n)
    FP <- pull(filter(data_counts, experiment_condition == "genuine"   & deepfake_detection_open_recoded == TRUE),  n)
    FN <- pull(filter(data_counts, experiment_condition == "deepfaked" & deepfake_detection_open_recoded == FALSE), n)
    TN <- pull(filter(data_counts, experiment_condition == "genuine"   & deepfake_detection_open_recoded == FALSE), n)

    #accuracy <- (TP+TN)/(TP+TN+FP+FN)
    sensitivity <- TP / (TP+FN)
    false_negative_rate <- 1 - sensitivity
    specificity <- TN / (TN+FP)
    false_positive_rate <- 1 - specificity

    # Youden's J statistic aka informedness aka "the probability of an informed decision (as opposed to a random guess) and takes into account all predictions". a zero value when a diagnostic test gives the same proportion of positive results for groups with and without the disease, i.e the test is useless.
    informedness <- sensitivity + specificity - 1

    balanced_accuracy <- (sensitivity + specificity)/2

    results <-
      tibble(variable = c(
        #"accuracy",
        "balanced_accuracy",
        "informedness",
        #"sensitivity",
        "false_negative_rate",
        #"specificity",
        "false_positive_rate"
      ),
      value = c(
        #accuracy,
        balanced_accuracy,
        informedness,
        #sensitivity,
        false_negative_rate,
        #specificity,
        false_positive_rate
      ))

    return(results)
  }

  # apply to each bootstrap
  fit_exploratory_classification_bootstraps_subset <- boots %>%
    mutate(categorization_stats = future_map(splits, bootstrap_categorization_stats)) %>%
    select(-splits) %>%
    unnest(categorization_stats)

  write_rds(fit_exploratory_classification_bootstraps_subset, "models/fit_exploratory_classification_bootstraps_subset.rds")

}

Results

classifications_subset <- fit_exploratory_classification_bootstraps_subset %>%
  group_by(variable) %>%
  summarize(ci_lower = quantile(value, 0.025),
            ci_upper = quantile(value, 0.975),
            .groups  = "drop") %>%
  full_join(fit_exploratory_classification_subset, by = "variable") %>%
  mutate_if(is.numeric, round, digits = 2) %>%
  select(variable, observed, ci_lower, ci_upper) 

classifications_subset %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE)
variable observed ci_lower ci_upper
balanced_accuracy 0.59 0.54 0.64
false_negative_rate 0.73 0.64 0.80
false_positive_rate 0.08 0.05 0.15
informedness 0.19 0.08 0.28

RQ4: Are people aware that content can be Deepfaked before they take part in the study, and does this make them better at detecting them?

Percent of participants awareness of the concept prior to study

I.e., using the full sample and reporting the sample percentage.

Description of sample:

percent_aware <- data_after_exclusions %>%
  dplyr::select(deepfake_awareness_open_recoded) %>%
  drop_na() %>%
  count(deepfake_awareness_open_recoded) %>%
  mutate(counts = n,
         awareness = as.factor(deepfake_awareness_open_recoded),
         percent_aware = round(counts/sum(counts)*100, 1)) %>%
  filter(awareness == "TRUE") %>%
  dplyr::select(percent_aware) 

percent_aware %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
percent_aware
53.5

In the subset of participants who were shown a deepfake, did prior awareness make them more likely to detect it?

Putting aside true negatives and false positive, does prior awareness of the concept of deepfaking at least make people better at detecting deepfakes

It would of course be possible include data from both experiment_conditions and add it to the model, however interpreting the two and three way interactions is less intuitive. Given this question is of secondary importance, I we therefore elected for the simpler analysis focusing on awareness and the FNR/TPR.

Fit model

# convert data to counts
data_counts_awareness_detection <- data_after_exclusions %>%
  filter(experiment_condition == "deepfaked") %>%
  dplyr::select(experiment, deepfake_awareness_open_recoded, deepfake_detection_open_recoded) %>%
  drop_na() %>%
  count(experiment, deepfake_awareness_open_recoded, deepfake_detection_open_recoded) %>%
  group_by(experiment) %>%
  mutate(counts = n,
         awareness = as.factor(deepfake_awareness_open_recoded),
         detection = as.factor(deepfake_detection_open_recoded),
         proportion = counts/sum(counts)) %>%
  ungroup() %>%
  dplyr::select(experiment, awareness, detection, counts, proportion)

# total counts is needed later to convert to proportions
total_counts_awareness_detection <- data_counts_awareness_detection %>%
  group_by(experiment) %>%
  summarize(total = sum(counts)) 

# fit poisson model
fit_exploratory_poisson_awareness_detection <- 
  brm(formula = counts ~ 1 + awareness * detection + (1 | experiment),
      family  = poisson(),
      data    = data_counts_awareness_detection,
      file    = "models/fit_exploratory_poisson_awareness_detection",
      prior   = prior(normal(0, 10)),
      iter    = 10000,
      warmup  = 3000,
      control = list(adapt_delta = 0.998,
                     max_treedepth = 18),  # to avoid divergent transitions
      chains  = 4,
      cores   = parallel::detectCores())

Inspect convergence

pp_check(fit_exploratory_poisson_awareness_detection, nsamples = 100)

summary(fit_exploratory_poisson_awareness_detection)
##  Family: poisson 
##   Links: mu = log 
## Formula: counts ~ 1 + awareness * detection + (1 | experiment) 
##    Data: data_counts_awareness_detection (Number of observations: 8) 
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
##          total post-warmup samples = 28000
## 
## Group-Level Effects: 
## ~experiment (Number of levels: 2) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.69      0.97     0.01     3.43 1.00     4885     9539
## 
## Population-Level Effects: 
##                             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                       3.71      0.65     2.33     5.14 1.00     6693
## awarenessTRUE                   0.01      0.16    -0.30     0.32 1.00    14248
## detectionTRUE                  -1.94      0.31    -2.59    -1.36 1.00    11346
## awarenessTRUE:detectionTRUE     0.99      0.38     0.26     1.74 1.00    10820
##                             Tail_ESS
## Intercept                       5754
## awarenessTRUE                  14911
## detectionTRUE                  12932
## awarenessTRUE:detectionTRUE    12992
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit_exploratory_poisson_awareness_detection, ask = FALSE)

Check informativeness of prior

Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.

check_prior(fit_exploratory_poisson_awareness_detection) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter Prior_Quality
b_Intercept uninformative
b_awarenessTRUE uninformative
b_detectionTRUE uninformative
b_awarenessTRUE.detectionTRUE uninformative

Interpret posteriors

plot_model(fit_exploratory_poisson_awareness_detection)

# plot(conditional_effects(fit_exploratory_poisson_awareness_detection), ask = FALSE)

Parameter estimates

# posterior draws for parameters (for results table)
draws_awareness_detection <- posterior_samples(fit_exploratory_poisson_awareness_detection) %>%
  dplyr::select(awarenessTRUE = b_awarenessTRUE, 
                detectionTRUE = b_detectionTRUE, 
                interaction = `b_awarenessTRUE:detectionTRUE`) 

estimates_awareness_detection <- 
  full_join(as_tibble(map_estimate(draws_awareness_detection)),
            as_tibble(bayestestR::hdi(draws_awareness_detection, ci = .95)), 
          by = "Parameter") %>%
  # exponentiate the log IRR values to IRR
  mutate_if(is.numeric, exp) %>%
  full_join(draws_awareness_detection %>%
              gather(Parameter, value) %>%
              group_by(Parameter) %>%
              summarize(pd = mean(exp(value) > 1)) %>%
              mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
              ungroup() %>%
              select(Parameter, p),
            by = "Parameter") %>%
  dplyr::select(Parameter, incidence_rate_ratio_MAP = MAP_Estimate, CI_95_lower = CI_low, CI_95_upper = CI_high, p) 
  # convert from odds to probability
  # mutate_if(is.numeric, function(x){x/(1+x)}) %>%
  
# table
estimates_awareness_detection %>%
  mutate_at(vars("incidence_rate_ratio_MAP", "CI_95_lower", "CI_95_upper"), round, digits = 2) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter incidence_rate_ratio_MAP CI_95_lower CI_95_upper p
awarenessTRUE 1.01 0.75 1.38 0.4682143
detectionTRUE 0.15 0.08 0.26 0.0000000
interaction 2.58 1.27 5.59 0.0034643
# hypothesis testing
H4 <- ifelse((estimates_awareness_detection %>% filter(Parameter == "interaction") %>% pull(CI_95_lower)) > 1, 
              "Accepted", "Rejected")

comparison_string_awareness_detection <-
  paste0("Individuals who were aware of the concept of Deepfakes prior to participating in the study were ",
         estimates_awareness_detection %>% filter(Parameter == "interaction") %>% pull(incidence_rate_ratio_MAP) %>% round(1), 
         " times more likely to detect that they had been shown a deepfake than those who were not aware of the concept (Incidence Rate Ratio = ",
         estimates_awareness_detection %>% filter(Parameter == "interaction") %>% pull(incidence_rate_ratio_MAP) %>% round(2), 
         ", 95% CI [",
         estimates_awareness_detection %>% filter(Parameter == "interaction") %>% pull(CI_95_lower) %>% round(2),
         ", ",
         estimates_awareness_detection %>% filter(Parameter == "interaction") %>% pull(CI_95_upper) %>% round(2),
         "])")

H4

Participants who reporting being aware of the concept of Deepfakes prior to participating in the experiment will be better at detectin Deepfakes when presented with one. Specifically, we will use participants in the Deepfake condition to calculate cell counts for the Deepfake concept check and Deepfake detection questions (i.e., what number of participants were observed for each combination of those variables). We will then use a Bayesian Poisson model (model 5) to estimate a 95% Confidence Interval on the Incidence Rate Ratio for the interaction effect between Deepfakes concept awareness and deepfake detection. A Confidence Interval whose lower bound is > 1 will be considered evidence in support of this hypothesis.

  • Result: Accepted
  • Individuals who were aware of the concept of Deepfakes prior to participating in the study were 2.6 times more likely to detect that they had been shown a deepfake than those who were not aware of the concept (Incidence Rate Ratio = 2.58, 95% CI [1.27, 5.59])

Predicted probabilities

posterior_predictions_awareness_detection <-
  tibble(experiment = c("5", "6"),
         awareness = c("TRUE", "FALSE"),
         detection = c("TRUE", "FALSE")) %>%
  data_grid(experiment, awareness, detection) %>%
  add_predicted_draws(model = fit_exploratory_poisson_awareness_detection, re_formula = NULL) %>%
  rename(predicted_count = .prediction) %>%
  left_join(total_counts_awareness_detection, by = "experiment") %>%
  mutate(predicted_probabiity = predicted_count/total) %>%
  ungroup() %>%
  dplyr::select(experiment, awareness, detection, predicted_count, predicted_probabiity) 


posterior_predictions_awareness_detection_aT_dT <- posterior_predictions_awareness_detection %>% 
  filter(awareness == "TRUE" & detection == "TRUE")
posterior_predictions_awareness_detection_aT_dF <- posterior_predictions_awareness_detection %>% 
  filter(awareness == "TRUE" & detection == "FALSE")
posterior_predictions_awareness_detection_aF_dT <- posterior_predictions_awareness_detection %>% 
  filter(awareness == "FALSE" & detection == "TRUE")
posterior_predictions_awareness_detection_aF_dF <- posterior_predictions_awareness_detection %>% 
  filter(awareness == "FALSE" & detection == "FALSE")


results_detection_probabilities <- 
  rbind(
    bind_cols(as_tibble(map_estimate(posterior_predictions_awareness_detection_aT_dT$predicted_probabiity)),
              as_tibble(bayestestR::hdi(posterior_predictions_awareness_detection_aT_dT$predicted_probabiity, 
                                        ci = .95))) %>%
      mutate(awareness = "TRUE", detection = "TRUE"),
    bind_cols(as_tibble(map_estimate(posterior_predictions_awareness_detection_aT_dF$predicted_probabiity)),
              as_tibble(bayestestR::hdi(posterior_predictions_awareness_detection_aT_dF$predicted_probabiity, 
                                        ci = .95))) %>%
      mutate(awareness = "TRUE", detection = "FALSE"),
    bind_cols(as_tibble(map_estimate(posterior_predictions_awareness_detection_aF_dT$predicted_probabiity)),
              as_tibble(bayestestR::hdi(posterior_predictions_awareness_detection_aF_dT$predicted_probabiity, 
                                        ci = .95))) %>%
      mutate(awareness = "FALSE", detection = "TRUE"),
    bind_cols(as_tibble(map_estimate(posterior_predictions_awareness_detection_aF_dF$predicted_probabiity)),
              as_tibble(bayestestR::hdi(posterior_predictions_awareness_detection_aF_dF$predicted_probabiity, 
                                        ci = .95))) %>%
      mutate(awareness = "FALSE", detection = "FALSE")
  ) %>%
  dplyr::select(awareness, detection, detection_probability_MAP = value, 
                CI_95_lower = CI_low, CI_95_upper = CI_high) %>%
  mutate_if(is.numeric, round, digits = 3) 

results_detection_probabilities %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
awareness detection detection_probability_MAP CI_95_lower CI_95_upper
TRUE TRUE 0.144 0.067 0.250
TRUE FALSE 0.379 0.260 0.567
FALSE TRUE 0.058 0.010 0.115
FALSE FALSE 0.390 0.231 0.538
  • Probability of detecting deepfake if unaware: 0.058
  • Probability of detecting deepfake if aware: 0.144

RQ5: Does knowing about the concept of Deepfaking make you immune to its influence?

Subset who received deepfaked videos and were aware of the concept prior to the experiment. Same Bayesian multilevel models as employed above, using only source_valence as IV, i.e., to detect whether learning effects are credibly non-zero in this subset.

Sample sizes

data_aware_subset_n <- data_after_exclusions %>%
  filter(experiment_condition == "deepfaked" & !is.na(deepfake_awareness_open_recoded)) %>%
  count(deepfake_awareness_open_recoded) %>%
  mutate(proportion = round(n/sum(n), 2)) %>%
  arrange(desc(proportion))

data_aware_subset_n %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
deepfake_awareness_open_recoded n proportion
TRUE 114 0.55
FALSE 93 0.45
data_aware_subset <- data_after_exclusions %>%
  filter(experiment_condition == "deepfaked" & deepfake_awareness_open_recoded == TRUE)

Self-reported evaluations

Fit model

fit_exploratory_selfreport_deepfaked_aware <-
  brm(formula = mean_self_reported_evaluation ~ source_valence + (1 | experiment),
      family  = gaussian(),
      data    = data_aware_subset,
      file    = "models/fit_exploratory_selfreport_deepfaked_aware",
      prior   = prior(normal(0, 10)),
      iter    = 10000,
      warmup  = 3000,
      control = list(adapt_delta = 0.99),  # to avoid divergent transitions
      chains  = 4,
      cores   = parallel::detectCores())

Inspect convergence

summary(fit_exploratory_selfreport_deepfaked_aware)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: mean_self_reported_evaluation ~ source_valence + (1 | experiment) 
##    Data: data_aware_subset (Number of observations: 114) 
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
##          total post-warmup samples = 28000
## 
## Group-Level Effects: 
## ~experiment (Number of levels: 2) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.78      0.98     0.01     3.40 1.00     4823     9909
## 
## Population-Level Effects: 
##                        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                 -1.60      0.62    -3.00    -0.19 1.00     6824
## source_valencepositive     3.07      0.20     2.67     3.47 1.00    17063
##                        Tail_ESS
## Intercept                  5694
## source_valencepositive    16352
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     1.07      0.07     0.94     1.23 1.00    16528    15469
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit_exploratory_selfreport_deepfaked_aware, ask = FALSE)

Check informativeness of prior

Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.

check_prior(fit_exploratory_selfreport_deepfaked_aware) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter Prior_Quality
b_Intercept uninformative
b_source_valencepositive uninformative

Interpret posteriors

#plot_model(fit_exploratory_selfreport_deepfaked_aware)
plot_model(fit_exploratory_selfreport_deepfaked_aware, type = "pred", terms = "source_valence")

# results table
draws_sr_deepfaked_aware <-
  select(spread_draws(fit_exploratory_selfreport_deepfaked_aware, b_source_valencepositive), b_source_valencepositive) %>%
  rename(effect_deepfaked_aware = b_source_valencepositive)

estimates_sr_deepfaked_aware <-
  map_estimate(draws_sr_deepfaked_aware) %>%
  full_join(bayestestR::hdi(draws_sr_deepfaked_aware, ci = .95) %>%
              rename(CI_95_lower = CI_low,
                     CI_95_upper = CI_high) %>%
              as_tibble(),
            by = "Parameter") %>%
  full_join(bayestestR::hdi(draws_sr_deepfaked_aware, ci = .90) %>%
              as_tibble() %>%
              rename(CI_90_lower = CI_low,
                     CI_90_upper = CI_high),
            by = "Parameter") %>%
  full_join(draws_sr_deepfaked_aware %>%
              gather(Parameter, value) %>%
              group_by(Parameter) %>%
              summarize(pd = mean(value > 0)) %>%
              mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
              ungroup() %>%
              select(Parameter, p),
            by = "Parameter") %>%
  select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper, 
         CI_90_lower, CI_90_upper, p) 

bind_rows(filter(estimates_sr, Parameter %in% c("effect_deepfaked")),
          estimates_sr_deepfaked_aware) %>%
  mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter MAP_Estimate CI_95_lower CI_95_upper CI_90_lower CI_90_upper p
effect_deepfaked 2.80 2.63 2.96 2.66 2.93 0
effect_deepfaked_aware 3.08 2.68 3.48 2.73 3.40 0
# hypothesis testing
H5a <- ifelse((estimates_sr_deepfaked_aware %>% filter(Parameter == "effect_deepfaked_aware") %>%
                 pull(CI_95_lower)) > 0, 
              "Accepted", "Rejected")

H5a

In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ self-reported evaluations. Specifically, we will use a Bayesian linear model (model 3) to estimate a 95% Confidence Interval on standardized effect size change in self-reported evaluations between Source Valence conditions in the genuine video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis. ######### needs editing #########

  • Result: Accepted

Implicit

Fit model

fit_exploratory_implicit_deepfaked_aware <-
  brm(formula = IAT_D2 ~ source_valence + (1 | experiment),
      family  = gaussian(),
      data    = data_aware_subset,
      file    = "models/fit_exploratory_implicit_deepfaked_aware",
      prior   = prior(normal(0, 10)),
      iter    = 10000,
      warmup  = 3000,
      control = list(adapt_delta = 0.99),  # to avoid divergent transitions
      chains  = 4,
      cores   = parallel::detectCores())

Inspect convergence

summary(fit_exploratory_implicit_deepfaked_aware)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: IAT_D2 ~ source_valence + (1 | experiment) 
##    Data: data_aware_subset (Number of observations: 114) 
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
##          total post-warmup samples = 28000
## 
## Group-Level Effects: 
## ~experiment (Number of levels: 2) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.76      1.00     0.01     3.54 1.00     4880     8892
## 
## Population-Level Effects: 
##                        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                 -0.09      0.67    -1.60     1.42 1.00     6140
## source_valencepositive     1.39      0.19     1.02     1.75 1.00    16554
##                        Tail_ESS
## Intercept                  6124
## source_valencepositive    15509
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.99      0.07     0.87     1.13 1.00    16744    15952
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit_exploratory_implicit_deepfaked_aware, ask = FALSE)

Check informativeness of prior

Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.

check_prior(fit_exploratory_implicit_deepfaked_aware) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter Prior_Quality
b_Intercept uninformative
b_source_valencepositive uninformative

Interpret posteriors

#plot_model(fit_exploratory_implicit_deepfaked_aware)
plot_model(fit_exploratory_implicit_deepfaked_aware, type = "pred", terms = "source_valence")

# results table
draws_imp_deepfaked_aware <-
  select(spread_draws(fit_exploratory_implicit_deepfaked_aware, b_source_valencepositive), b_source_valencepositive) %>%
  rename(effect_deepfaked_aware = b_source_valencepositive)

estimates_imp_deepfaked_aware <-
  map_estimate(draws_imp_deepfaked_aware) %>%
  full_join(bayestestR::hdi(draws_imp_deepfaked_aware, ci = .95) %>%
              rename(CI_95_lower = CI_low,
                     CI_95_upper = CI_high) %>%
              as_tibble(),
            by = "Parameter") %>%
  full_join(bayestestR::hdi(draws_imp_deepfaked_aware, ci = .90) %>%
              as_tibble() %>%
              rename(CI_90_lower = CI_low,
                     CI_90_upper = CI_high),
            by = "Parameter") %>%
  full_join(draws_imp_deepfaked_aware %>%
              gather(Parameter, value) %>%
              group_by(Parameter) %>%
              summarize(pd = mean(value > 0)) %>%
              mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
              ungroup() %>%
              select(Parameter, p),
            by = "Parameter") %>%
  select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper, 
         CI_90_lower, CI_90_upper, p) 

bind_rows(filter(estimates_imp, Parameter %in% c("effect_deepfaked")),
          estimates_imp_deepfaked_aware) %>%
  mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter MAP_Estimate CI_95_lower CI_95_upper CI_90_lower CI_90_upper p
effect_deepfaked 1.41 1.23 1.55 1.26 1.53 0
effect_deepfaked_aware 1.38 1.02 1.75 1.08 1.69 0
# hypothesis testing
H5b <- ifelse((estimates_imp_deepfaked_aware %>% filter(Parameter == "effect_deepfaked_aware") %>%
                 pull(CI_95_lower)) > 0, 
              "Accepted", "Rejected")

H5b

In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ IAT D2 scores. Specifically, we will use a Bayesian linear model (model 4) to estimate a 95% Confidence Interval on standardized effect size change in IAT D2 scores between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis. ######### needs editing #########

  • Result: Accepted

Behavioural intentions

Fit model

fit_exploratory_intentions_deepfaked_aware <-
  brm(formula = mean_intentions ~ source_valence, # no random effect for experiment as only exp 6 assessed intentions
      family  = gaussian(),
      data    = data_aware_subset,
      file    = "models/fit_exploratory_intentions_deepfaked_aware",
      prior   = prior(normal(0, 10)),
      iter    = 10000,
      warmup  = 3000,
      control = list(adapt_delta = 0.99),  # to avoid divergent transitions
      chains  = 4,
      cores   = parallel::detectCores())

Inspect convergence

summary(fit_exploratory_intentions_deepfaked_aware)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: mean_intentions ~ source_valence 
##    Data: data_aware_subset (Number of observations: 53) 
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
##          total post-warmup samples = 28000
## 
## Population-Level Effects: 
##                        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                 -3.31      0.22    -3.75    -2.87 1.00    21104
## source_valencepositive     3.09      0.29     2.52     3.66 1.00    20531
##                        Tail_ESS
## Intercept                 16876
## source_valencepositive    16552
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     1.04      0.11     0.86     1.28 1.00    21315    16915
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit_exploratory_intentions_deepfaked_aware, ask = FALSE)

Check informativeness of prior

Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.

check_prior(fit_exploratory_intentions_deepfaked_aware) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter Prior_Quality
b_Intercept uninformative
b_source_valencepositive uninformative

Interpret posteriors

#plot_model(fit_exploratory_intentions_deepfaked_aware)
plot_model(fit_exploratory_intentions_deepfaked_aware, type = "pred", terms = "source_valence")

# results table
draws_intentions_deepfaked_aware <-
  select(spread_draws(fit_exploratory_intentions_deepfaked_aware, b_source_valencepositive), b_source_valencepositive) %>%
  rename(effect_deepfaked_aware = b_source_valencepositive)

estimates_intentions_deepfaked_aware <-
  map_estimate(draws_intentions_deepfaked_aware) %>%
  full_join(bayestestR::hdi(draws_intentions_deepfaked_aware, ci = .95) %>%
              rename(CI_95_lower = CI_low,
                     CI_95_upper = CI_high) %>%
              as_tibble(),
            by = "Parameter") %>%
  full_join(bayestestR::hdi(draws_intentions_deepfaked_aware, ci = .90) %>%
              as_tibble() %>%
              rename(CI_90_lower = CI_low,
                     CI_90_upper = CI_high),
            by = "Parameter") %>%
  full_join(draws_intentions_deepfaked_aware %>%
              gather(Parameter, value) %>%
              group_by(Parameter) %>%
              summarize(pd = mean(value > 0)) %>%
              mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
              ungroup() %>%
              select(Parameter, p),
            by = "Parameter") %>%
  select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper, 
         CI_90_lower, CI_90_upper, p)

bind_rows(filter(estimates_intentions, Parameter %in% c("effect_deepfaked")),
          estimates_intentions_deepfaked_aware) %>%
  mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter MAP_Estimate CI_95_lower CI_95_upper CI_90_lower CI_90_upper p
effect_deepfaked 1.37 0.99 1.76 1.05 1.69 0
effect_deepfaked_aware 3.11 2.53 3.67 2.62 3.58 0
# hypothesis testing
H5c <- ifelse((estimates_intentions_deepfaked_aware %>% filter(Parameter == "effect_deepfaked_aware") %>%
                 pull(CI_95_lower)) > 0, 
              "Accepted", "Rejected")

H5c

In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ behavioral intention scores. Specifically, we will use a Bayesian linear model (model 4) to estimate a 95% Confidence Interval on standardized effect size change in behavioral intention scores between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis. ######### needs editing #########

  • Result: Accepted

RQ6: Does knowing content is Deepfaked make you immune to its influence?

Subset who received deepfaked videos but also detected them. Same Bayesian multilevel models as employed above, using only source_valence as IV, i.e., to detect whether learning effects are credibly non-zero in this subset.

Sample sizes

data_detectors_subset_n <- data_after_exclusions %>%
  filter(experiment_condition == "deepfaked" & !is.na(deepfake_detection_open_recoded)) %>%
  count(deepfake_detection_open_recoded) %>%
  mutate(proportion = round(n/sum(n), 2)) %>%
  arrange(desc(proportion))

data_detectors_subset_n %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
deepfake_detection_open_recoded n proportion
FALSE 273 0.73
TRUE 99 0.27
data_detectors_subset <- data_after_exclusions %>%
  filter(experiment_condition == "deepfaked" & deepfake_detection_open_recoded == TRUE)

Self-reported evaluations

Fit model

fit_exploratory_selfreport_deepfaked_detected <-
  brm(formula = mean_self_reported_evaluation ~ source_valence + (1 | experiment),
      family  = gaussian(),
      data    = data_detectors_subset,
      file    = "models/fit_exploratory_selfreport_deepfaked_detected",
      prior   = prior(normal(0, 10)),
      iter    = 10000,
      warmup  = 3000,
      control = list(adapt_delta = 0.99),  # to avoid divergent transitions
      chains  = 4,
      cores   = parallel::detectCores())

Inspect convergence

summary(fit_exploratory_selfreport_deepfaked_detected)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: mean_self_reported_evaluation ~ source_valence + (1 | experiment) 
##    Data: data_detectors_subset (Number of observations: 99) 
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
##          total post-warmup samples = 28000
## 
## Group-Level Effects: 
## ~experiment (Number of levels: 3) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.42      0.50     0.01     1.84 1.00     4521     4742
## 
## Population-Level Effects: 
##                        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                 -1.60      0.35    -2.33    -0.84 1.00     6354
## source_valencepositive     2.75      0.24     2.28     3.22 1.00    15987
##                        Tail_ESS
## Intercept                  4888
## source_valencepositive    16649
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     1.17      0.09     1.02     1.36 1.00    18604    17389
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit_exploratory_selfreport_deepfaked_detected, ask = FALSE)

Check informativeness of prior

Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.

check_prior(fit_exploratory_selfreport_deepfaked_detected) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter Prior_Quality
b_Intercept uninformative
b_source_valencepositive uninformative

Interpret posteriors

#plot_model(fit_exploratory_selfreport_deepfaked_detected)
plot_model(fit_exploratory_selfreport_deepfaked_detected, type = "pred", terms = "source_valence")

# results table
draws_sr_deepfaked_detected <-
  select(spread_draws(fit_exploratory_selfreport_deepfaked_detected, b_source_valencepositive), b_source_valencepositive) %>%
  rename(effect_deepfaked_detected = b_source_valencepositive)

estimates_sr_deepfaked_detected <-
  map_estimate(draws_sr_deepfaked_detected) %>%
  full_join(bayestestR::hdi(draws_sr_deepfaked_detected, ci = .95) %>%
              rename(CI_95_lower = CI_low,
                     CI_95_upper = CI_high) %>%
              as_tibble(),
            by = "Parameter") %>%
  full_join(bayestestR::hdi(draws_sr_deepfaked_detected, ci = .90) %>%
              as_tibble() %>%
              rename(CI_90_lower = CI_low,
                     CI_90_upper = CI_high),
            by = "Parameter") %>%
  full_join(draws_sr_deepfaked_detected %>%
              gather(Parameter, value) %>%
              group_by(Parameter) %>%
              summarize(pd = mean(value > 0)) %>%
              mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
              ungroup() %>%
              select(Parameter, p),
            by = "Parameter") %>%
  select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper, 
         CI_90_lower, CI_90_upper, p) 

bind_rows(filter(estimates_sr, Parameter %in% c("effect_deepfaked")),
          estimates_sr_deepfaked_detected) %>%
  mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter MAP_Estimate CI_95_lower CI_95_upper CI_90_lower CI_90_upper p
effect_deepfaked 2.80 2.63 2.96 2.66 2.93 0
effect_deepfaked_detected 2.74 2.29 3.23 2.35 3.13 0
# hypothesis testing
H6a <- ifelse((estimates_sr_deepfaked_detected %>% filter(Parameter == "effect_deepfaked_detected") %>%
                 pull(CI_95_lower)) > 0, 
              "Accepted", "Rejected")

H6a

In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ self-reported evaluations. Specifically, we will use a Bayesian linear model (model 3) to estimate a 95% Confidence Interval on standardized effect size change in self-reported evaluations between Source Valence conditions in the genuine video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.

  • Result: Accepted

Implicit

Fit model

fit_exploratory_implicit_deepfaked_detected <-
  brm(formula = IAT_D2 ~ source_valence + (1 | experiment),
      family  = gaussian(),
      data    = data_detectors_subset,
      file    = "models/fit_exploratory_implicit_deepfaked_detected",
      prior   = prior(normal(0, 10)),
      iter    = 10000,
      warmup  = 3000,
      control = list(adapt_delta = 0.99),  # to avoid divergent transitions
      chains  = 4,
      cores   = parallel::detectCores())

Inspect convergence

summary(fit_exploratory_implicit_deepfaked_detected)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: IAT_D2 ~ source_valence + (1 | experiment) 
##    Data: data_detectors_subset (Number of observations: 99) 
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
##          total post-warmup samples = 28000
## 
## Group-Level Effects: 
## ~experiment (Number of levels: 3) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     1.05      0.86     0.25     3.31 1.00     6046     8652
## 
## Population-Level Effects: 
##                        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                  0.06      0.66    -1.36     1.46 1.00     7078
## source_valencepositive     1.07      0.19     0.71     1.43 1.00    18476
##                        Tail_ESS
## Intercept                  7543
## source_valencepositive    14850
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.91      0.07     0.79     1.06 1.00    17701    13337
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit_exploratory_implicit_deepfaked_detected, ask = FALSE)

Check informativeness of prior

Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.

check_prior(fit_exploratory_implicit_deepfaked_detected) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter Prior_Quality
b_Intercept uninformative
b_source_valencepositive uninformative

Interpret posteriors

#plot_model(fit_exploratory_implicit_deepfaked_detected)
plot_model(fit_exploratory_implicit_deepfaked_detected, type = "pred", terms = "source_valence")

# results table
draws_imp_deepfaked_detected <-
  select(spread_draws(fit_exploratory_implicit_deepfaked_detected, b_source_valencepositive), b_source_valencepositive) %>%
  rename(effect_deepfaked_detected = b_source_valencepositive)

estimates_imp_deepfaked_detected <-
  map_estimate(draws_imp_deepfaked_detected) %>%
  full_join(bayestestR::hdi(draws_imp_deepfaked_detected, ci = .95) %>%
              rename(CI_95_lower = CI_low,
                     CI_95_upper = CI_high) %>%
              as_tibble(),
            by = "Parameter") %>%
  full_join(bayestestR::hdi(draws_imp_deepfaked_detected, ci = .90) %>%
              as_tibble() %>%
              rename(CI_90_lower = CI_low,
                     CI_90_upper = CI_high),
            by = "Parameter") %>%
  full_join(draws_imp_deepfaked_detected %>%
              gather(Parameter, value) %>%
              group_by(Parameter) %>%
              summarize(pd = mean(value > 0)) %>%
              mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
              ungroup() %>%
              select(Parameter, p),
            by = "Parameter") %>%
  select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper, 
         CI_90_lower, CI_90_upper, p) 

bind_rows(filter(estimates_imp, Parameter %in% c("effect_deepfaked")),
          estimates_imp_deepfaked_detected) %>%
  mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter MAP_Estimate CI_95_lower CI_95_upper CI_90_lower CI_90_upper p
effect_deepfaked 1.41 1.23 1.55 1.26 1.53 0
effect_deepfaked_detected 1.06 0.70 1.42 0.76 1.37 0
# hypothesis testing
H6b <- ifelse((estimates_imp_deepfaked_detected %>% filter(Parameter == "effect_deepfaked_detected") %>%
                 pull(CI_95_lower)) > 0, 
              "Accepted", "Rejected")

H6b

In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ IAT D2 scores. Specifically, we will use a Bayesian linear model (model 4) to estimate a 95% Confidence Interval on standardized effect size change in IAT D2 scores between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.

  • Result: Accepted

Behavioural intentions

Fit model

fit_exploratory_intentions_deepfaked_detected <-
  brm(formula = mean_intentions ~ source_valence, # no random effect for experiment as only exp 6 assessed intentions
      family  = gaussian(),
      data    = data_detectors_subset,
      file    = "models/fit_exploratory_intentions_deepfaked_detected",
      prior   = prior(normal(0, 10)),
      iter    = 10000,
      warmup  = 3000,
      control = list(adapt_delta = 0.99),  # to avoid divergent transitions
      chains  = 4,
      cores   = parallel::detectCores())

Inspect convergence

summary(fit_exploratory_intentions_deepfaked_detected)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: mean_intentions ~ source_valence 
##    Data: data_detectors_subset (Number of observations: 24) 
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
##          total post-warmup samples = 28000
## 
## Population-Level Effects: 
##                        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                 -3.27      0.27    -3.80    -2.73 1.00    19083
## source_valencepositive     2.71      0.42     1.89     3.54 1.00    18613
##                        Tail_ESS
## Intercept                 14831
## source_valencepositive    15579
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     1.00      0.16     0.74     1.36 1.00    18144    13630
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit_exploratory_intentions_deepfaked_detected, ask = FALSE)

Check informativeness of prior

Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.

check_prior(fit_exploratory_intentions_deepfaked_detected) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter Prior_Quality
b_Intercept uninformative
b_source_valencepositive uninformative

Interpret posteriors

#plot_model(fit_exploratory_intentions_deepfaked_detected)
plot_model(fit_exploratory_intentions_deepfaked_detected, type = "pred", terms = "source_valence")

# results table
draws_intentions_deepfaked_detected <-
  select(spread_draws(fit_exploratory_intentions_deepfaked_detected, b_source_valencepositive), b_source_valencepositive) %>%
  rename(effect_deepfaked_detected = b_source_valencepositive)

estimates_intentions_deepfaked_detected <-
  map_estimate(draws_intentions_deepfaked_detected) %>%
  full_join(bayestestR::hdi(draws_intentions_deepfaked_detected, ci = .95) %>%
              rename(CI_95_lower = CI_low,
                     CI_95_upper = CI_high) %>%
              as_tibble(),
            by = "Parameter") %>%
  full_join(bayestestR::hdi(draws_intentions_deepfaked_detected, ci = .90) %>%
              as_tibble() %>%
              rename(CI_90_lower = CI_low,
                     CI_90_upper = CI_high),
            by = "Parameter") %>%
  full_join(draws_intentions_deepfaked_detected %>%
              gather(Parameter, value) %>%
              group_by(Parameter) %>%
              summarize(pd = mean(value > 0)) %>%
              mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
              ungroup() %>%
              select(Parameter, p),
            by = "Parameter") %>%
  select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper, 
         CI_90_lower, CI_90_upper, p)

bind_rows(filter(estimates_intentions, Parameter %in% c("effect_deepfaked")),
          estimates_intentions_deepfaked_detected) %>%
  mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter MAP_Estimate CI_95_lower CI_95_upper CI_90_lower CI_90_upper p
effect_deepfaked 1.37 0.99 1.76 1.05 1.69 0
effect_deepfaked_detected 2.77 1.88 3.52 2.03 3.38 0
# hypothesis testing
H6c <- ifelse((estimates_intentions_deepfaked_detected %>% filter(Parameter == "effect_deepfaked_detected") %>%
                 pull(CI_95_lower)) > 0, 
              "Accepted", "Rejected")

H6c

In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ behavioral intention scores. Specifically, we will use a Bayesian linear model (model 4) to estimate a 95% Confidence Interval on standardized effect size change in behavioral intention scores between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis.

  • Result: Accepted

RQ7: Does both being aware of the concept of Deepfaking and correcting diagnosing that content is Deepfaked make you immune to its influence?

Subset who received deepfaked videos, were aware of the concept before the study, and also detected them. Same Bayesian multilevel models as employed above, using only source_valence as IV, i.e., to detect whether learning effects are credibly non-zero in this subset.

Sample sizes

data_aware_detectors_subset_n <- data_after_exclusions %>%
  filter(experiment_condition == "deepfaked" & 
           !is.na(deepfake_detection_open_recoded) & 
           !is.na(deepfake_awareness_open_recoded)) %>%
  count(deepfake_detection_open_recoded, deepfake_awareness_open_recoded) %>%
  mutate(proportion = round(n/sum(n), 2)) %>%
  arrange(desc(proportion))

data_aware_detectors_subset_n %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
deepfake_detection_open_recoded deepfake_awareness_open_recoded n proportion
FALSE TRUE 82 0.40
FALSE FALSE 81 0.39
TRUE TRUE 32 0.15
TRUE FALSE 12 0.06
data_aware_detectors_subset <- data_after_exclusions %>%
  filter(experiment_condition == "deepfaked" & deepfake_detection_open_recoded == TRUE & deepfake_awareness_open_recoded == TRUE)

Self-reported evaluations

Fit model

fit_exploratory_selfreport_deepfaked_aware_detected <-
  brm(formula = mean_self_reported_evaluation ~ source_valence + (1 | experiment),
      family  = gaussian(),
      data    = data_aware_detectors_subset,
      file    = "models/fit_exploratory_selfreport_deepfaked_aware_detected",
      prior   = prior(normal(0, 10)),
      iter    = 10000,
      warmup  = 3000,
      control = list(adapt_delta = 0.99),  # to avoid divergent transitions
      chains  = 4,
      cores   = parallel::detectCores())

Inspect convergence

summary(fit_exploratory_selfreport_deepfaked_aware_detected)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: mean_self_reported_evaluation ~ source_valence + (1 | experiment) 
##    Data: data_aware_detectors_subset (Number of observations: 32) 
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
##          total post-warmup samples = 28000
## 
## Group-Level Effects: 
## ~experiment (Number of levels: 2) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.92      1.04     0.02     3.76 1.00     6834    11383
## 
## Population-Level Effects: 
##                        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                 -1.67      0.79    -3.40    -0.03 1.00     9078
## source_valencepositive     3.26      0.48     2.30     4.22 1.00    16897
##                        Tail_ESS
## Intercept                  8457
## source_valencepositive    15216
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     1.34      0.18     1.04     1.76 1.00    17130    15103
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit_exploratory_selfreport_deepfaked_aware_detected, ask = FALSE)

Check informativeness of prior

Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.

check_prior(fit_exploratory_selfreport_deepfaked_aware_detected) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter Prior_Quality
b_Intercept uninformative
b_source_valencepositive uninformative

Interpret posteriors

#plot_model(fit_exploratory_selfreport_deepfaked_aware_detected)
plot_model(fit_exploratory_selfreport_deepfaked_aware_detected, type = "pred", terms = "source_valence")

# results table
draws_sr_deepfaked_aware_detected <-
  select(spread_draws(fit_exploratory_selfreport_deepfaked_aware_detected, b_source_valencepositive), b_source_valencepositive) %>%
  rename(effect_deepfaked_aware_detected = b_source_valencepositive)

estimates_sr_deepfaked_aware_detected <-
  map_estimate(draws_sr_deepfaked_aware_detected) %>%
  full_join(bayestestR::hdi(draws_sr_deepfaked_aware_detected, ci = .95) %>%
              rename(CI_95_lower = CI_low,
                     CI_95_upper = CI_high) %>%
              as_tibble(),
            by = "Parameter") %>%
  full_join(bayestestR::hdi(draws_sr_deepfaked_aware_detected, ci = .90) %>%
              as_tibble() %>%
              rename(CI_90_lower = CI_low,
                     CI_90_upper = CI_high),
            by = "Parameter") %>%
  full_join(draws_sr_deepfaked_aware_detected %>%
              gather(Parameter, value) %>%
              group_by(Parameter) %>%
              summarize(pd = mean(value > 0)) %>%
              mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
              ungroup() %>%
              select(Parameter, p),
            by = "Parameter") %>%
  select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper, 
         CI_90_lower, CI_90_upper, p) 

bind_rows(filter(estimates_sr, Parameter %in% c("effect_deepfaked")),
          estimates_sr_deepfaked_aware_detected) %>%
  mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter MAP_Estimate CI_95_lower CI_95_upper CI_90_lower CI_90_upper p
effect_deepfaked 2.80 2.63 2.96 2.66 2.93 0
effect_deepfaked_aware_detected 3.25 2.35 4.26 2.48 4.06 0
# hypothesis testing
H7a <- ifelse((estimates_sr_deepfaked_aware_detected %>% filter(Parameter == "effect_deepfaked_aware_detected") %>%
                 pull(CI_95_lower)) > 0, 
              "Accepted", "Rejected")

H7a

In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ self-reported evaluations. Specifically, we will use a Bayesian linear model (model 3) to estimate a 95% Confidence Interval on standardized effect size change in self-reported evaluations between Source Valence conditions in the genuine video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis. ######### needs editing #########

  • Result: Accepted

Implicit

Fit model

fit_exploratory_implicit_deepfaked_aware_detected <-
  brm(formula = IAT_D2 ~ source_valence + (1 | experiment),
      family  = gaussian(),
      data    = data_aware_detectors_subset,
      file    = "models/fit_exploratory_implicit_deepfaked_aware_detected",
      prior   = prior(normal(0, 10)),
      iter    = 10000,
      warmup  = 3000,
      control = list(adapt_delta = 0.99),  # to avoid divergent transitions
      chains  = 4,
      cores   = parallel::detectCores())

Inspect convergence

summary(fit_exploratory_implicit_deepfaked_aware_detected)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: IAT_D2 ~ source_valence + (1 | experiment) 
##    Data: data_aware_detectors_subset (Number of observations: 32) 
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
##          total post-warmup samples = 28000
## 
## Group-Level Effects: 
## ~experiment (Number of levels: 2) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     1.37      1.28     0.12     4.66 1.00     6234     7638
## 
## Population-Level Effects: 
##                        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                 -0.24      0.98    -2.39     1.74 1.00     7766
## source_valencepositive     1.23      0.34     0.56     1.90 1.00    15998
##                        Tail_ESS
## Intercept                  7573
## source_valencepositive    12518
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.94      0.13     0.73     1.24 1.00    13717    14561
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit_exploratory_implicit_deepfaked_aware_detected, ask = FALSE)

Check informativeness of prior

Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.

check_prior(fit_exploratory_implicit_deepfaked_aware_detected) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter Prior_Quality
b_Intercept uninformative
b_source_valencepositive uninformative

Interpret posteriors

#plot_model(fit_exploratory_implicit_deepfaked_aware_detected)
plot_model(fit_exploratory_implicit_deepfaked_aware_detected, type = "pred", terms = "source_valence")

# results table
draws_imp_deepfaked_aware_detected <-
  select(spread_draws(fit_exploratory_implicit_deepfaked_aware_detected, b_source_valencepositive), b_source_valencepositive) %>%
  rename(effect_deepfaked_aware_detected = b_source_valencepositive)

estimates_imp_deepfaked_aware_detected <-
  map_estimate(draws_imp_deepfaked_aware_detected) %>%
  full_join(bayestestR::hdi(draws_imp_deepfaked_aware_detected, ci = .95) %>%
              rename(CI_95_lower = CI_low,
                     CI_95_upper = CI_high) %>%
              as_tibble(),
            by = "Parameter") %>%
  full_join(bayestestR::hdi(draws_imp_deepfaked_aware_detected, ci = .90) %>%
              as_tibble() %>%
              rename(CI_90_lower = CI_low,
                     CI_90_upper = CI_high),
            by = "Parameter") %>%
  full_join(draws_imp_deepfaked_aware_detected %>%
              gather(Parameter, value) %>%
              group_by(Parameter) %>%
              summarize(pd = mean(value > 0)) %>%
              mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
              ungroup() %>%
              select(Parameter, p),
            by = "Parameter") %>%
  select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper, 
         CI_90_lower, CI_90_upper, p) 

bind_rows(filter(estimates_imp, Parameter %in% c("effect_deepfaked")),
          estimates_imp_deepfaked_aware_detected) %>%
  mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter MAP_Estimate CI_95_lower CI_95_upper CI_90_lower CI_90_upper p
effect_deepfaked 1.41 1.23 1.55 1.26 1.53 0.0000000
effect_deepfaked_aware_detected 1.22 0.54 1.88 0.68 1.78 0.0003214
# hypothesis testing
H7b <- ifelse((estimates_imp_deepfaked_aware_detected %>% filter(Parameter == "effect_deepfaked_aware_detected") %>%
                 pull(CI_95_lower)) > 0, 
              "Accepted", "Rejected")

H7b

In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ IAT D2 scores. Specifically, we will use a Bayesian linear model (model 4) to estimate a 95% Confidence Interval on standardized effect size change in IAT D2 scores between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis. ######### needs editing #########

  • Result: Accepted

Behavioural intentions

Fit model

fit_exploratory_intentions_deepfaked_aware_detected <-
  brm(formula = mean_intentions ~ source_valence, # no random effect for experiment as only exp 6 assessed intentions
      family  = gaussian(),
      data    = data_aware_detectors_subset,
      file    = "models/fit_exploratory_intentions_deepfaked_aware_detected",
      prior   = prior(normal(0, 10)),
      iter    = 10000,
      warmup  = 3000,
      control = list(adapt_delta = 0.99),  # to avoid divergent transitions
      chains  = 4,
      cores   = parallel::detectCores())

Inspect convergence

summary(fit_exploratory_intentions_deepfaked_aware_detected)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: mean_intentions ~ source_valence 
##    Data: data_aware_detectors_subset (Number of observations: 17) 
## Samples: 4 chains, each with iter = 10000; warmup = 3000; thin = 1;
##          total post-warmup samples = 28000
## 
## Population-Level Effects: 
##                        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                 -3.15      0.36    -3.86    -2.43 1.00    17518
## source_valencepositive     2.46      0.53     1.39     3.52 1.00    17793
##                        Tail_ESS
## Intercept                 14708
## source_valencepositive    14680
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     1.08      0.22     0.75     1.60 1.00    15668    15286
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit_exploratory_intentions_deepfaked_aware_detected, ask = FALSE)

Check informativeness of prior

Using Gelman’s (2019) simple heuristic: For each parameter, compare the posterior sd to the prior sd. If the posterior sd for any parameter is more than 0.1 times the prior sd, then note that the prior was informative.

check_prior(fit_exploratory_intentions_deepfaked_aware_detected) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter Prior_Quality
b_Intercept uninformative
b_source_valencepositive uninformative

Interpret posteriors

#plot_model(fit_exploratory_intentions_deepfaked_aware_detected)
plot_model(fit_exploratory_intentions_deepfaked_aware_detected, type = "pred", terms = "source_valence")

# results table
draws_intentions_deepfaked_aware_detected <-
  select(spread_draws(fit_exploratory_intentions_deepfaked_aware_detected, b_source_valencepositive), b_source_valencepositive) %>%
  rename(effect_deepfaked_aware_detected = b_source_valencepositive)

estimates_intentions_deepfaked_aware_detected <-
  map_estimate(draws_intentions_deepfaked_aware_detected) %>%
  full_join(bayestestR::hdi(draws_intentions_deepfaked_aware_detected, ci = .95) %>%
              rename(CI_95_lower = CI_low,
                     CI_95_upper = CI_high) %>%
              as_tibble(),
            by = "Parameter") %>%
  full_join(bayestestR::hdi(draws_intentions_deepfaked_aware_detected, ci = .90) %>%
              as_tibble() %>%
              rename(CI_90_lower = CI_low,
                     CI_90_upper = CI_high),
            by = "Parameter") %>%
  full_join(draws_intentions_deepfaked_aware_detected %>%
              gather(Parameter, value) %>%
              group_by(Parameter) %>%
              summarize(pd = mean(value > 0)) %>%
              mutate(p = pd_to_p(pd, direction = "one-sided")) %>%
              ungroup() %>%
              select(Parameter, p),
            by = "Parameter") %>%
  select(Parameter, MAP_Estimate, CI_95_lower, CI_95_upper, 
         CI_90_lower, CI_90_upper, p)

bind_rows(filter(estimates_intentions, Parameter %in% c("effect_deepfaked")),
          estimates_intentions_deepfaked_aware_detected) %>%
  mutate_at(.vars = c("MAP_Estimate", "CI_95_lower", "CI_95_upper", "CI_90_lower", "CI_90_upper"), round, digits = 2) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE)
Parameter MAP_Estimate CI_95_lower CI_95_upper CI_90_lower CI_90_upper p
effect_deepfaked 1.37 0.99 1.76 1.05 1.69 0
effect_deepfaked_aware_detected 2.44 1.39 3.52 1.58 3.32 0
# hypothesis testing
H7c <- ifelse((estimates_intentions_deepfaked_aware_detected %>% filter(Parameter == "effect_deepfaked_aware_detected") %>%
                 pull(CI_95_lower)) > 0, 
              "Accepted", "Rejected")

H7c

In the subset of participants who were shown a Deepfaked video and accurately detected that the video was Deepfaked, the content of the Deepfaked videos (i.e., Source Valence) will influence participants’ behavioral intention scores. Specifically, we will use a Bayesian linear model (model 4) to estimate a 95% Confidence Interval on standardized effect size change in behavioral intention scores between Source Valence conditions in the Deepfaked video condition subgroup. Confidence Intervals whose lower bounds are > 0 will be considered evidence in support of this hypothesis. ######### needs editing #########

  • Result: Accepted

Summary of hypothesis tests

H1: Establishing first impressions via online video content

  • Genuine content can establish self-reported evaluations (Accepted), implicit evaluations (Accepted), and behavioural intentions (Accepted).
  • Deepfaked content can establish self-reported evaluations (Accepted), implicit evaluations (Accepted), and behavioural intentions (Accepted).

H2: Are deepfakes just as good as the real thing?

  • Deepfakes are non-inferior to genuine content on self-reported evaluations (Accepted), implicit evaluations (Accepted), and behavioural intentions (Accepted).

H3: How good are people at detecting whether content is genuine or Deepfaked?

Whole sample

classifications %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE)
variable observed ci_lower ci_upper
balanced_accuracy 0.59 0.56 0.62
false_negative_rate 0.73 0.69 0.78
false_positive_rate 0.08 0.04 0.12
informedness 0.19 0.13 0.25

Those who were aware of the concept prior to the study

classifications_subset %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE)
variable observed ci_lower ci_upper
balanced_accuracy 0.59 0.54 0.64
false_negative_rate 0.73 0.64 0.80
false_positive_rate 0.08 0.05 0.15
informedness 0.19 0.08 0.28

H4: Does prior awareness of the concept of Deepfakes make people better at detecting them?

  • Percent aware of the concept of Deepfakes: 53.5
  • Of those exposed to a Deepfake, Individuals who were aware of the concept of Deepfakes prior to participating in the study were 2.6 times more likely to detect that they had been shown a deepfake than those who were not aware of the concept (Incidence Rate Ratio = 2.58, 95% CI [1.27, 5.59]), Accepted.

H5-7: Does being aware of the concept, detecitng the deepfake, or both make you immune to a Deepfake?

  • H5: Evaluative learning effects found in the subset who were shown Deepfakes and were aware of the concept, on self-reports (Accepted), implicit measure (Accepted) and behavioural intentions (Accepted).
  • H6: Evaluative learning effects found in the subset who were shown Deepfakes and detected them, on self-reports (Accepted), implicit measure (Accepted) and behavioural intentions (Accepted).
  • H6: Evaluative learning effects found in the subset who were shown Deepfakes, were aware of the concept, and detected them, on self-reports (Accepted), implicit measure (Accepted) and behavioural intentions (Accepted).

Session Info

sessionInfo()
## R version 4.0.2 (2020-06-22)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS  10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_IE.UTF-8/en_IE.UTF-8/en_IE.UTF-8/C/en_IE.UTF-8/en_IE.UTF-8
## 
## attached base packages:
## [1] parallel  stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] furrr_0.2.1      future_1.19.1    modelr_0.1.8     semTools_0.5-3  
##  [5] lavaan_0.6-7     IATscores_0.2.7  broom_0.7.2      rsample_0.0.7   
##  [9] psych_2.0.9      sjPlot_2.8.4     bayestestR_0.7.5 tidybayes_2.0.3 
## [13] brms_2.14.0      Rcpp_1.0.5       kableExtra_1.3.1 knitr_1.30      
## [17] forcats_0.5.0    stringr_1.4.0    dplyr_1.0.2      purrr_0.3.4     
## [21] readr_1.3.1      tidyr_1.1.2      tibble_3.0.4     ggplot2_3.3.2   
## [25] tidyverse_1.3.0 
## 
## loaded via a namespace (and not attached):
##   [1] tidyselect_1.1.0     lme4_1.1-25          htmlwidgets_1.5.1   
##   [4] grid_4.0.2           munsell_0.5.0        codetools_0.2-16    
##   [7] effectsize_0.4.0     statmod_1.4.34       DT_0.13             
##  [10] miniUI_0.1.1.1       withr_2.3.0          Brobdingnag_1.2-6   
##  [13] colorspace_2.0-0     highr_0.8            rstudioapi_0.13     
##  [16] stats4_4.0.2         bayesplot_1.7.2      listenv_0.8.0       
##  [19] labeling_0.4.2       huge_1.3.4.1         emmeans_1.4.6       
##  [22] rstan_2.21.2         mnormt_1.5-7         farver_2.0.3        
##  [25] bridgesampling_1.0-0 coda_0.19-3          vctrs_0.3.5         
##  [28] generics_0.0.2       TH.data_1.0-10       xfun_0.19           
##  [31] R6_2.5.0             markdown_1.1         assertthat_0.2.1    
##  [34] promises_1.1.0       scales_1.1.1         multcomp_1.4-13     
##  [37] nnet_7.3-14          gtable_0.3.0         globals_0.13.1      
##  [40] processx_3.4.4       sandwich_2.5-1       rlang_0.4.8         
##  [43] splines_4.0.2        checkmate_2.0.0      inline_0.3.16       
##  [46] yaml_2.2.1           reshape2_1.4.4       abind_1.4-5         
##  [49] d3Network_0.5.2.1    threejs_0.3.3        crosstalk_1.1.0.1   
##  [52] backports_1.1.9      httpuv_1.5.2         rsconnect_0.8.16    
##  [55] Hmisc_4.4-1          tools_4.0.2          ellipsis_0.3.1      
##  [58] RColorBrewer_1.1-2   ggridges_0.5.2       plyr_1.8.6          
##  [61] base64enc_0.1-3      ps_1.4.0             prettyunits_1.1.1   
##  [64] rpart_4.1-15         pbapply_1.4-2        zoo_1.8-8           
##  [67] qgraph_1.6.5         haven_2.3.1          cluster_2.1.0       
##  [70] fs_1.4.1             magrittr_2.0.1       data.table_1.13.2   
##  [73] colourpicker_1.0     reprex_0.3.0         mvtnorm_1.1-1       
##  [76] whisker_0.4          sjmisc_2.8.5         matrixStats_0.56.0  
##  [79] hms_0.5.3            shinyjs_1.1          mime_0.9            
##  [82] evaluate_0.14        arrayhelpers_1.1-0   xtable_1.8-4        
##  [85] shinystan_2.5.0      sjstats_0.18.0       jpeg_0.1-8.1        
##  [88] readxl_1.3.1         gridExtra_2.3        ggeffects_0.14.3    
##  [91] rstantools_2.1.1     compiler_4.0.2       V8_3.2.0            
##  [94] crayon_1.3.4         minqa_1.2.4          StanHeaders_2.21.0-6
##  [97] htmltools_0.5.0      corpcor_1.6.9        later_1.0.0         
## [100] Formula_1.2-3        RcppParallel_5.0.2   lubridate_1.7.9     
## [103] DBI_1.1.0            sjlabelled_1.1.7     dbplyr_1.4.3        
## [106] MASS_7.3-53          boot_1.3-25          Matrix_1.2-18       
## [109] cli_2.1.0            insight_0.10.0       igraph_1.2.5        
## [112] BDgraph_2.62         pkgconfig_2.0.3      foreign_0.8-80      
## [115] xml2_1.3.2           svUnit_1.0.3         pbivnorm_0.6.0      
## [118] dygraphs_1.1.1.6     webshot_0.5.2        estimability_1.3    
## [121] rvest_0.3.5          snakecase_0.11.0     callr_3.5.1         
## [124] digest_0.6.27        parameters_0.8.6     rmarkdown_2.5       
## [127] cellranger_1.1.0     htmlTable_1.13.3     curl_4.3            
## [130] shiny_1.5.0          gtools_3.8.2         rjson_0.2.20        
## [133] nloptr_1.2.2.2       glasso_1.11          lifecycle_0.2.0     
## [136] nlme_3.1-148         jsonlite_1.7.1       viridisLite_0.3.0   
## [139] fansi_0.4.1          pillar_1.4.6         lattice_0.20-41     
## [142] loo_2.3.1            fastmap_1.0.1        httr_1.4.1          
## [145] pkgbuild_1.1.0       survival_3.1-12      glue_1.4.2          
## [148] xts_0.12-0           fdrtool_1.2.15       png_0.1-7           
## [151] shinythemes_1.1.2    stringi_1.4.6        performance_0.4.6   
## [154] latticeExtra_0.6-29